Tuning the Pentium Pro microarchitecture
Author
Abstract
Intel Corporation

This inside look at a large microprocessor development project reveals some of the reasoning (for goals, changes, trade-offs, and performance simulation) that lay behind its final form. Designing a wholly new microprocessor is difficult and expensive. To justify this effort, a major new microarchitecture must improve performance one and a half or two times over the previous-generation microarchitecture, when evaluated on equivalent process technology. In addition, semiconductor process technology continues to evolve while the processor design is in progress. The previous-generation microarchitecture increases in clock speed and performance due to compactions and conversion to newer technology. A new microarchitecture must "intercept" the process technology to achieve a compounding of process and microarchitectural speedups. The process technology, degree of pipelining, and amount of effort a team is willing to spend on circuit and layout issues determine the clock speed of a microarchitecture. Typically, a microarchitecture will start with the same clock speed as a prior microarchitecture (adjusted for process technology scaling). This enables the maximum reuse of past designs and circuits, and fits the new design to the existing product development tools and methodology. Performance enhancements should come primarily from the microarchitecture and not from clock speed enhancements per se. Often, a new processor's die area is close to the maximum that can be manufactured. This design choice stems from marketplace competitiveness and efforts to get as much performance as possible in the new microarchitecture. While making the die smaller and cheaper and improving performance are desirable, it is generally not possible to achieve a 1.5-to-2-times-better performance goal without using at least 1.5 to 2 times a prior design's transistors. Finally, new processor designs often incorporate new features.
As the performance of the core logic improves, designs must continue to enhance the bus and cache architecture to keep pace with the core. Further, as other technologies (such as multiprocessing) mature, there is a natural tendency to draw them into the processor design as a way of providing additional features and value for the end user. The large installed base and broad range of applications for the Intel architecture place additional constraints on the design, constraints beyond the purely academic ones of performance and clock frequency. We do not have the flexibility to control software applications, compilers, or operating systems in the same way system vendors can. We cannot remove obsolete features and must cater to a …
Similar resources
Effectiveness and Limitations of Embedded Counter Based Performance Analysis
This paper presents an experimental study on the performance of the Intel Pentium Pro microprocessor using embedded performance counters. The counters enable detailed run-time analysis of branching and memory subsystem performance, and are accessed through a custom designed tool. The study uses Windows NT and realistic benchmarks including BAPco’s Sysmark32 suite, the Ziff-Davis Winstone97 PC B...
Pentium III Processor Implementation Tradeoffs
This paper discusses the implementation tradeoffs of the Pentium III processor. The Pentium III processor implements a new extension of the IA-32 instruction set called the Internet Streaming Single-Instruction, Multiple-Data (SIMD) Extensions (Internet SSE). The processor is based on the Pentium Pro processor microarchitecture. The initial development goals for the Pentium III processor were ...
Pentium® Pro Processor Design for Test and Debug
Its microarchitecture forms the basis of the company's future high-volume microprocessor portfolio. The initial design has also been augmented with enhancements such as MMX technology and compacted onto newer fabrication processes to create the Pentium II processor product line. The original Pentium Pro processor design, known initially as the P6, introduced several performance features, i...
The Microarchitecture of the Intel® Pentium®
This paper describes the first Intel Pentium 4 processor manufactured on the 90nm process. We briefly review the NetBurst microarchitecture and discuss how this new implementation retains its key characteristics, such as the execution trace cache and a 2x frequency execution core designed for high throughput. This Pentium 4 processor improves upon the performance of prior implementations of the...
Performance Characterization of the Pentium® Pro Processor
In this paper, we characterize the performance of several business and technical benchmarks on a Pentium Pro processor-based system. Various architectural data are collected using a performance monitoring counter tool. Results show that the Pentium Pro processor achieves significantly lower cycles per instruction than the Pentium processor due to its out-of-order and speculative execution, and ...
A New Approach to Determining the Time-Stamping Counter's..
Due to its moderate overhead and small quantization error, the time-stamping counter is currently the most precise time-measuring mechanism on Intel 80X86-based platforms. On the Pentium processors, we can simply use a conventional null benchmark to determine the time-stamping counter’s overhead accurately. Similarly, on the Pentium Pro processors, Intel also recommends the same method for measu...
Journal:
- IEEE Micro
Volume 16, Issue
Pages -
Publication date 1996